(57) Abstract: The present invention relates to novel proteins. More specifically, isolated nucleic acid molecules are provided
encoding novel polypeptides. Novel polypeptides and antibodies that bind to these polypeptides are provided. Also provided are
vectors, host cells, and recombinant and synthetic methods for producing human polynucleotides and/or polypeptides, and
antibodies. The invention further relates to diagnostic and therapeutic methods useful for diagnosing, treating, preventing and/or
prognosing disorders related to these novel polypeptides. The invention further relates to screening methods for identifying
agonists and antagonists of polynucleotides and polypeptides of the invention. The present invention further relates to methods
and/or compositions for inhibiting or enhancing the production and function of the polypeptides of the present invention.
Nucleic Acids, Proteins, and Antibodies



[1] This application refers to a "Sequence Listing" that is provided only on electronic
media in computer readable form pursuant to Administrative Instructions Section 801(a)(i).
The Sequence Listing forms a part of this description pursuant to Rule 5.2 and 
Administrative Instructions Sections 801 to 806, and is hereby incorporated in its entirety. 
[2] The Sequence Listing is provided as an electronic file (PTZ15PCT_seqList.txt,
1,891,228 bytes in size, created on January 13, 2001) on four identical compact discs
(CD-R), labeled "COPY 1," "COPY 2," "COPY 3," and "CRF." The Sequence Listing complies
with Annex C of the Administrative Instructions, and may be viewed, for example, on an 
IBM-PC machine running the MS-Windows operating system by using the V viewer 
software, version 2000 (see World Wide Web URL: http://www.fileviewer.com). 



Field of the Invention 

[3] The present invention relates to novel proteins. More specifically, isolated nucleic
acid molecules are provided encoding novel polypeptides. Novel polypeptides and antibodies
that bind to these polypeptides are provided. Also provided are vectors, host cells, and
recombinant and synthetic methods for producing human polynucleotides and/or
polypeptides, and antibodies. The invention further relates to diagnostic and therapeutic



WO 01/55326    PCT/US01/01347



methods useful for diagnosing, treating, preventing and/or prognosing disorders related to 
these novel polypeptides. The invention further relates to screening methods for identifying 
agonists and antagonists of polynucleotides and polypeptides of the invention. The present 
invention further relates to methods and/or compositions for inhibiting or enhancing the 
production and function of the polypeptides of the present invention. 

Background of the Invention 

[4] The human genome is estimated to contain roughly 100,000 genes, each of which
plays an important role in sustaining life. Each of these roughly 100,000 genes encodes
a corresponding protein which can be classified based upon its structure and/or function.
Some proteins are secreted, while other proteins reside either as membrane associated 
proteins or intracellularly. Although protein sequences vary substantially, many patterns and 
overall properties are shared, such as, for example, amino-terminal signal sequences. 
[5] Some proteins, for example secreted proteins, contain an amino-terminal signal 
sequence which facilitates protein transport. This amino-terminal signal sequence directs, or 
targets, the protein from its ribosomal assembly site to a particular cellular or extracellular 
location. Transport may involve any combination of several of the following steps: contact 
with a chaperone, unfolding, interaction with a receptor and/or a pore complex, addition of 
energy, and refolding. Moreover, an extracellular protein may be produced as an inactive 
precursor. Once the precursor has been exported, removal of the signal sequence by a signal 
peptidase activates the protein. Examples of some protein families that contain signal 
sequences include cytokines (chemokines) and hormones (growth and differentiation factors). 
Computer algorithms can be generated to identify amino-terminal signal sequences. 
Examples of computer programs designed to identify amino-terminal signal sequences 
include hidden Markov models (HMMs), statistical alternatives to the FASTA and
Smith-Waterman algorithms, which have been used to find shared patterns, specifically consensus
sequences (Pearson, W.R., and D.J. Lipman, PNAS, 85:2444-48 (1988); Smith, T.F., and
M.S. Waterman, J Mol Biol, 147:195-97 (1981)). These algorithms are quite flexible in that
they incorporate information from newly identified sequences to build even more successful 
patterns. 
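
By way of non-limiting illustration, the local-alignment idea behind the Smith-Waterman algorithm cited above can be sketched as follows. This is a minimal sketch only: the function name `smith_waterman_score` and the match, mismatch, and gap values are assumptions chosen for the example and are not taken from this application. The sketch returns only the best local score, using a linear gap penalty.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Scoring values are illustrative assumptions, not parameters from the text.
    """
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Extend the diagonal with a match or mismatch.
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Open or extend a gap in either sequence.
            up = prev[j] + gap
            left = curr[j - 1] + gap
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, up, left)
            best = max(best, curr[j])
        prev = curr
    return best
```

Because every cell is floored at zero, the score reflects the best-matching shared stretch of the two sequences rather than an end-to-end alignment, which is why this family of methods is suited to finding shared patterns.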

[6] Other families of proteins exist as membrane associated proteins. Examples of
some of these membrane associated protein families include receptors (nuclear, 4
transmembrane, G protein coupled, and tyrosine kinase), protein kinases, phosphatases,
neuropeptides and vasomediators, G proteins, ion channels (calcium, chloride, potassium, 
and sodium), proteases, transporter/pumps (amino acid, sugar, protein, metal and vitamin; 
calcium, phosphate, potassium, and sodium), matrix molecules (adhesion, cadherin, 
extracellular matrix molecules, integrin, and selectin), and regulatory proteins. Again, 
computer programs can aid in the discovery of these molecules. For example, Klein et al.
have developed a method ("ALOM", also called KKD) to detect potential transmembrane
segments in polypeptides (Klein, P., et al., Biochim. Biophys. Acta, 815:468 (1985)). It
attempts to identify the most probable transmembrane segment from the average 
hydrophobicity value over a range of amino acid residues. It predicts whether the segment is 
a transmembrane segment (INTEGRAL) or not (PERIPHERAL), and thus can suggest 
membrane association of a polypeptide. 
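
By way of non-limiting illustration, the windowed-average approach described above can be sketched as follows. This is not the ALOM discriminant function itself: the sketch substitutes the Kyte-Doolittle hydropathy scale, and the window size of 17 residues and the threshold of 1.6 are illustrative assumptions, as are the function names.

```python
# Kyte-Doolittle hydropathy values (used here as a stand-in scale).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def most_hydrophobic_window(seq, window=17):
    """Return (start_index, mean_hydropathy) of the most hydrophobic window."""
    if len(seq) < window:
        raise ValueError("sequence is shorter than the window")
    scores = [KD[res] for res in seq]
    best_start, best_mean = 0, float("-inf")
    for start in range(len(seq) - window + 1):
        mean = sum(scores[start:start + window]) / window
        if mean > best_mean:
            best_start, best_mean = start, mean
    return best_start, best_mean


def classify(seq, window=17, threshold=1.6):
    """Label a sequence INTEGRAL or PERIPHERAL from its best window (assumed cutoff)."""
    _, mean = most_hydrophobic_window(seq, window)
    return "INTEGRAL" if mean >= threshold else "PERIPHERAL"
```

The index returned by `most_hydrophobic_window` is zero-based; a real implementation would calibrate the scale, window, and cutoff against known membrane proteins.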

[7] Furthermore, some proteins function intracellularly, and can be identified by their 
structure and/or function. Computer algorithms can be adapted to aid in the identification of 
novel members of intracellular protein families. Examples of intracellular proteins include 
transcription factors, various classes of enzymes, mitochondrial proteins, and signal
transduction molecules. 

[8] Descriptions of some of these proteins (e.g., receptors, hormones, and matrix 
proteins) and diseases associated with their dysfunction follow. 

Summary of the Invention 

[9] The present invention relates to novel proteins. More specifically, isolated nucleic
acid molecules are provided encoding novel polypeptides. Novel polypeptides and antibodies
that bind to these polypeptides are provided. Also provided are vectors, host cells, and
recombinant and synthetic methods for producing human polynucleotides and/or
polypeptides, and antibodies. The invention further relates to diagnostic and therapeutic
methods useful for diagnosing, treating, preventing and/or prognosing disorders related to
these novel polypeptides. The invention further relates to screening methods for identifying 
agonists and antagonists of polynucleotides and polypeptides of the invention. The present 
invention further relates to methods and/or compositions for inhibiting or enhancing the 
production and function of the polypeptides of the present invention. 
Detailed Description

Tables 

[10] Table 1A summarizes some of the polynucleotides encompassed by the invention
(including cDNA clones related to the sequences (Clone ID NO:Z), contig sequences (contig
identifier (Contig ID:) and contig nucleotide sequence identifier (SEQ ID NO:X))) and further
summarizes certain characteristics of these polynucleotides and the polypeptides encoded
thereby. The first column provides the gene number in the application for each clone
identifier. The second column provides a unique clone identifier, "Clone ID NO:Z", for a
cDNA clone related to each contig sequence disclosed in Table 1A. The third column
provides a unique contig identifier, "Contig ID:" for each of the contig sequences disclosed
in Table 1A. The fourth column provides the sequence identifier, "SEQ ID NO:X", for each
of the contig sequences disclosed in Table 1A. The fifth column, "ORF (From-To)",
provides the location (i.e., nucleotide position numbers) within the polynucleotide sequence
of SEQ ID NO:X that delineate the preferred open reading frame (ORF) that encodes the
amino acid sequence shown in the sequence listing and referenced in Table 1A as SEQ ID 
NO:Y (column 6). Column 7 lists residues comprising predicted epitopes contained in the 
polypeptides encoded by each of the preferred ORFs (SEQ ID NO:Y). Identification of 
potential immunogenic regions was performed according to the method of Jameson and Wolf 
(CABIOS, 4:181-186 (1988)); specifically, the Genetics Computer Group (GCG)
implementation of this algorithm, embodied in the program PEPTIDESTRUCTURE
(Wisconsin Package v10.0, Genetics Computer Group (GCG), Madison, Wisc.). This method
returns a measure of the probability that a given residue is found on the surface of the protein. 
Regions where the antigenic index score is greater than 0.9 over at least 6 amino acids are 
indicated in Table 1A as "Predicted Epitopes". In particular embodiments, polypeptides of 
the invention comprise, or alternatively consist of, one, two, three, four, five or more of the 
predicted epitopes described in Table 1A. It will be appreciated that depending on the 
analytical criteria used to predict antigenic determinants, the exact address of the determinant 
may vary slightly. Column 8, "Tissue Distribution" shows the expression profile of tissue, 
cells, and/or cell line libraries which express the polynucleotides of the invention. The first 
number in column 8 (preceding the colon), represents the tissue/cell source identifier code 
corresponding to the key provided in Table 4. Expression of these polynucleotides was not 
observed in the other tissues and/or cell libraries tested. For those identifier codes in which 
the first two letters are not "AR", the second number in column 8 (following the colon),
represents the number of times a sequence corresponding to the reference polynucleotide 
sequence (e.g., SEQ ID NO:X) was identified in the tissue/cell source. Those tissue/cell 
source identifier codes in which the first two letters are "AR" designate information 
generated using DNA array technology. Utilizing this technology, cDNAs were amplified by 
PCR and then transferred, in duplicate, onto the array. Gene expression was assayed through 
hybridization of first strand cDNA probes to the DNA array. cDNA probes were generated 
from total RNA extracted from a variety of different tissues and cell lines. Probe synthesis 
was performed in the presence of 33P-dCTP, using oligo(dT) to prime reverse transcription.
After hybridization, high stringency washing conditions were employed to remove non- 
specific hybrids from the array. The remaining signal, emanating from each gene target, was 
measured using a Phosphorimager. Gene expression was reported as Phosphor Stimulating 
Luminescence (PSL) which reflects the level of phosphor signal generated from the probe 
hybridized to each of the gene targets represented on the array. A local background signal 
subtraction was performed before the total signal generated from each array was used to 
normalize gene expression between the different hybridizations. The value presented after 
"[array code]:" represents the mean of the duplicate values, following background subtraction 
and probe normalization. One of skill in the art could routinely use this information to 
identify normal and/or diseased tissue(s) which show a predominant expression pattern of the 
corresponding polynucleotide of the invention or to identify polynucleotides which show 
predominant and/or specific tissue and/or cell expression. Column 9 provides the 
chromosomal location of polynucleotides corresponding to SEQ ID NO:X. Chromosomal 
location was determined by finding exact matches to EST and cDNA sequences contained in 
the NCBI (National Center for Biotechnology Information) UniGene database. Given a 
presumptive chromosomal location, disease locus association was determined by comparison 
with the Morbid Map, derived from Online Mendelian Inheritance in Man (Online Mendelian 
Inheritance in Man, OMIM™. McKusick-Nathans Institute for Genetic Medicine, Johns
Hopkins University (Baltimore, MD) and National Center for Biotechnology Information, 
National Library of Medicine (Bethesda, MD) 2000. World Wide Web URL: 
http://www.ncbi.nlm.nih.gov/omim/). If the putative chromosomal location of the Query 
overlaps with the chromosomal location of a Morbid Map entry, an OMIM identification 
number is disclosed in column 10 labeled "OMIM Disease Reference(s)". A key to the 
OMIM reference identification numbers is provided in Table 5. 
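
By way of non-limiting illustration, the "Predicted Epitopes" criterion described above (an antigenic index score greater than 0.9 over at least 6 amino acids) can be sketched as follows. The sketch assumes that per-residue antigenic index scores have already been computed by some other program, and it reads the criterion as requiring every residue in the run to exceed the threshold; both points, and the function name, are assumptions of this example.

```python
def predicted_epitopes(scores, threshold=0.9, min_len=6):
    """Return 1-based inclusive (start, end) runs whose scores all exceed threshold.

    `scores` is a list of per-residue antigenic index values (assumed precomputed).
    """
    regions = []
    run_start = None  # zero-based index where the current run began
    for i, score in enumerate(scores):
        if score > threshold:
            if run_start is None:
                run_start = i
        else:
            # The run ended at residue i (zero-based i - 1); keep it if long enough.
            if run_start is not None and i - run_start >= min_len:
                regions.append((run_start + 1, i))
            run_start = None
    # Handle a run that extends to the end of the sequence.
    if run_start is not None and len(scores) - run_start >= min_len:
        regions.append((run_start + 1, len(scores)))
    return regions
```

Changing `threshold` or `min_len` shifts the reported boundaries, which is consistent with the observation above that the exact address of a determinant may vary slightly with the analytical criteria used.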
[11] Table 1B summarizes additional polynucleotides encompassed by the invention
(including cDNA clones related to the sequences (Clone ID NO:Z), contig sequences (contig
identifier (Contig ID:) and contig nucleotide sequence identifiers (SEQ ID NO:X)), and genomic
sequences (SEQ ID NO:B)). The first column provides a unique clone identifier, "Clone ID
NO:Z", for a cDNA clone related to each contig sequence. The second column provides the
sequence identifier, "SEQ ID NO:X", for each contig sequence. The third column provides a
unique contig identifier, "Contig ID:" for each contig sequence. The fourth column provides
a BAC identifier "BAC ID NO:A" for the BAC clone referenced in the corresponding row of
the table. The fifth column provides the nucleotide sequence identifier, "SEQ ID NO:B" for 
a fragment of the BAC clone identified in column four of the corresponding row of the table. 
The sixth column, "Exon From-To", provides the location (i.e., nucleotide position numbers) 
within the polynucleotide sequence of SEQ ID NO:B which delineate certain polynucleotides 
of the invention that are also exemplary members of polynucleotide sequences that encode 
polypeptides of the invention (e.g., polypeptides containing amino acid sequences encoded 
by the polynucleotide sequences delineated in column six, and fragments and variants 
thereof). 

[12] Table 2 summarizes homology and features of some of the polypeptides of the 
invention. The first column provides a unique clone identifier, "Clone ID NO:Z", 
corresponding to a cDNA clone disclosed in Table 1A. The second column provides the 
unique contig identifier, "Contig ID:" corresponding to contigs in Table 1A and allowing for
correlation with the information in Table 1A. The third column provides the sequence 
identifier, "SEQ ID NO:X", for the contig polynucleotide sequence. The fourth column 
provides the analysis method by which the homology/identity disclosed in the Table was 
determined. Comparisons were made between polypeptides encoded by the polynucleotides 
of the invention and either a non-redundant protein database (herein referred to as "NR"), or 
a database of protein families (herein referred to as "PFAM") as further described below. 
The fifth column provides a description of the PFAM/NR hit having a significant match to a 
polypeptide of the invention. Column six provides the accession number of the PFAM/NR 
hit disclosed in the fifth column. Column seven, "Score/Percent Identity", provides a quality 
score or the percent identity, of the hit disclosed in columns five and six. Columns 8 and 9, 
"NT From" and "NT To" respectively, delineate the polynucleotides in "SEQ ID NO:X" that 
encode a polypeptide having a significant match to the PFAM/NR database as disclosed in 
the fifth and sixth columns. In specific embodiments polypeptides of the invention comprise, 
or alternatively consist of, an amino acid sequence encoded by a polynucleotide in SEQ ID
NO:X as delineated in columns 8 and 9, or fragments or variants thereof. 
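
By way of non-limiting illustration, selecting the portion of a sequence delineated by a "From" and "To" pair can be sketched as follows. The sketch assumes that positions are 1-based and inclusive, which is the usual convention for nucleotide position numbers but is not stated explicitly here; the function name and the example sequence are hypothetical.

```python
def extract_region(seq, nt_from, nt_to):
    """Return the subsequence from nt_from to nt_to (assumed 1-based, inclusive)."""
    if not (1 <= nt_from <= nt_to <= len(seq)):
        raise ValueError("coordinates fall outside the sequence")
    # Python slices are zero-based and end-exclusive, hence the -1 on the start only.
    return seq[nt_from - 1:nt_to]


# Hypothetical nine-nucleotide sequence, for illustration only.
example = "ATGGCCTAA"
region = extract_region(example, 4, 6)
```

Under this convention the extracted region always has length `nt_to - nt_from + 1`.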
[13] Table 3 provides polynucleotide sequences that may be disclaimed according to 
certain embodiments of the invention. The first column provides a unique clone identifier, 
"Clone ID", for a cDNA clone related to contig sequences disclosed in Table 1A. The 
second column provides the sequence identifier, "SEQ ID NO:X", for contig sequences 
disclosed in Table 1A. The third column provides the unique contig identifier, "Contig ID:",
for contigs disclosed in Table 1A. The fourth column provides a unique integer 'a' where 'a'
is any integer between 1 and the final nucleotide minus 15 of SEQ ID NO:X, and the fifth
column provides a unique integer 'b' where 'b' is any integer between 15 and the final
nucleotide of SEQ ID NO:X, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO:X, and where b is greater than or equal to a + 14. For each of
the polynucleotides shown as SEQ ID NO:X, the uniquely defined integers can be substituted
into the general formula of a-b, and used to describe polynucleotides which may be 
preferably excluded from the invention. In certain embodiments, preferably excluded from 
the invention are at least one, two, three, four, five, ten, or more of the polynucleotide 
sequence(s) having the accession number(s) disclosed in the sixth column of this Table 
(including for example, published sequence in connection with a particular BAC clone). In 
further embodiments, preferably excluded from the invention are the specific polynucleotide 
sequence(s) contained in the clones corresponding to at least one, two, three, four, five, ten, 
or more of the available material having the accession numbers identified in the sixth column 
of this Table (including for example, the actual sequence contained in an identified BAC 
clone). 
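
By way of non-limiting illustration, the constraints on the integers a and b in the general formula a-b can be checked as follows, where n denotes the position of the final nucleotide of SEQ ID NO:X. The function names are assumptions of this sketch. Because b is at least a + 14, every fragment a-b spans at least 15 nucleotides.

```python
def is_valid_range(a, b, n):
    """Check the stated constraints: 1 <= a <= n - 15, 15 <= b <= n, b >= a + 14."""
    return 1 <= a <= n - 15 and 15 <= b <= n and b >= a + 14


def count_ranges(n):
    """Count the distinct (a, b) pairs permitted for a sequence of length n."""
    # For a fixed a, b may take any value from a + 14 through n.
    return sum(n - (a + 14) + 1 for a in range(1, n - 15 + 1))
```

For example, with n = 20 the permitted values of a are 1 through 5, and the number of permitted pairs is 6 + 5 + 4 + 3 + 2 = 20.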

[14] Table 4 provides a key to the tissue/cell source identifier code disclosed in Table 
1A, column 8. Column 1 provides the tissue/cell source identifier code disclosed in Table 1A,
Column 8. Columns 2-5 provide a description of the tissue or cell source. Codes 
corresponding to diseased tissues are indicated in column 6 with the word "disease". The use 
of the word "disease" in column 6 is non-limiting. The tissue or cell source may be specific 
(e.g. a neoplasm), or may be disease-associated (e.g., a tissue sample from a normal portion 
of a diseased organ). Furthermore, tissues and/or cells lacking the "disease" designation may 
still be derived from sources directly or indirectly involved in a disease state or disorder, and 
therefore may have a further utility in that disease state or disorder. In numerous cases where 
the tissue/cell source is a library, column 7 identifies the vector used to generate the library. 
[15] Table 5 provides a key to the OMIM reference identification numbers disclosed in
Table 1A, column 10. OMIM reference identification numbers (Column 1) were derived 
from Online Mendelian Inheritance in Man (Online Mendelian Inheritance in Man, OMIM. 
McKusick-Nathans Institute for Genetic Medicine, Johns Hopkins University (Baltimore, 
MD) and National Center for Biotechnology Information, National Library of Medicine, 
(Bethesda, MD) 2000. World Wide Web URL: http://www.ncbi.nlm.nih.gov/omim/). 
Column 2 provides diseases associated with the cytologic band disclosed in Table 1A, 
column 9, as determined using the Morbid Map database. 

[16] Table 6 summarizes ATCC Deposits, Deposit dates, and ATCC designation 
numbers of deposits made with the ATCC in connection with the present application. 
[17] Table 7 shows the cDNA libraries sequenced, and ATCC designation numbers and 
vector information relating to these cDNA libraries. 

[18] Table 8 provides a physical characterization of clones encompassed by the 
invention. The first column provides the unique clone identifier, "Clone ID NO:Z", for 
certain cDNA clones of the invention, as described in Table 1A. The second column provides
the size of the cDNA insert contained in the corresponding cDNA clone. 

Definitions 

[19] The following definitions are provided to facilitate understanding of certain terms 
used throughout this specification. 

[20] In the present invention, "isolated" refers to material removed from its original 
environment (e.g., the natural environment if it is naturally occurring), and thus is altered "by 
the hand of man" from its natural state. For example, an isolated polynucleotide could be 
part of a vector or a composition of matter, or could be contained within a cell, and still be 
"isolated" because that vector, composition of matter, or particular cell is not the original 
environment of the polynucleotide. The term "isolated" does not refer to genomic or cDNA 
libraries, whole cell total or mRNA preparations, genomic DNA preparations (including 
those separated by electrophoresis and transferred onto blots), sheared whole cell genomic 
DNA preparations or other compositions where the art demonstrates no distinguishing 
features of the polynucleotide/sequences of the present invention. 

[21] As used herein, a "polynucleotide" refers to a molecule having a nucleic acid 
sequence encoding SEQ ID NO:Y or a fragment or variant thereof; a nucleic acid sequence 
contained in SEQ ID NO:X (as described in column 3 of Table 1A) or the complement
thereof; a cDNA sequence contained in Clone ID NO:Z (as described in column 2 of Table
1A and contained within a library deposited with the ATCC); a nucleotide sequence encoding
the polypeptide encoded by a nucleotide sequence in SEQ ID NO:B as defined in column 6
of Table 1B or a fragment or variant thereof; or a nucleotide coding sequence in SEQ ID
NO:B as defined in column 6 of Table 1B or the complement thereof. For example, the
polynucleotide can contain the nucleotide sequence of the full length cDNA sequence,
including the 5' and 3' untranslated sequences, the coding region, as well as fragments,
epitopes, domains, and variants of the nucleic acid sequence. Moreover, as used herein, a 
"polypeptide" refers to a molecule having an amino acid sequence encoded by a 
polynucleotide of the invention as broadly defined (obviously excluding poly-Phenylalanine 
or poly-Lysine peptide sequences which result from translation of a polyA tail of a sequence 
corresponding to a cDNA). 

[22] In the present invention, "SEQ ID NO:X" was often generated by overlapping 
sequences contained in multiple clones (contig analysis). A representative clone containing 
all or most of the sequence for SEQ ID NO:X is deposited at Human Genome Sciences, Inc. 
(HGS) in a catalogued and archived library. As shown, for example, in column 2 of Table 
1A, each clone is identified by a cDNA Clone ID (identifier generally referred to herein as 
Clone ID NO:Z). Each Clone ID is unique to an individual clone and the Clone ID is all the 
information needed to retrieve a given clone from the HGS library. Furthermore, certain 
clones disclosed in this application have been deposited with the ATCC on October 5, 2000, 
having the ATCC designation numbers PTA 2574 and PTA 2575; and on January 5, 2001, 
having the depositor reference numbers TS-1, TS-2, AC-1, and AC-2. In addition to the 
individual cDNA clone deposits, most of the cDNA libraries from which the clones were 
derived were deposited at the American Type Culture Collection (hereinafter "ATCC"). 
Table 7 provides a list of the deposited cDNA libraries. One can use the Clone ID NO:Z to 
determine the library source by reference to Tables 6 and 7. Table 7 lists the deposited 
cDNA libraries by name and links each library to an ATCC Deposit. Library names contain 
four characters, for example, "HTWE." The name of a cDNA clone (Clone ID) isolated from 
that library begins with the same four characters, for example "HTWEP07". As mentioned 
below, Table 1A correlates the Clone ID names with SEQ ID NO:X. Thus, starting with an
SEQ ID NO:X, one can use Tables 1, 6 and 7 to determine the corresponding Clone ID, 
which library it came from and which ATCC deposit the library is contained in. Furthermore, 
it is possible to retrieve a given cDNA clone from the source library by techniques known in
the art and described elsewhere herein. The ATCC is located at 10801 University Boulevard, 
Manassas, Virginia 20110-2209, USA. The ATCC deposits were made pursuant to the terms 
of the Budapest Treaty on the international recognition of the deposit of microorganisms for 
the purposes of patent procedure. 
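
By way of non-limiting illustration, the naming convention described above (a four-character library name that is also the prefix of each Clone ID from that library) can be sketched as follows. The function names are assumptions of this sketch, and the lookup table is a hypothetical stand-in for Table 7; the deposit value shown is a placeholder, not an actual ATCC designation.

```python
def library_name(clone_id):
    """Return the four-character library name that prefixes a Clone ID."""
    if len(clone_id) <= 4:
        raise ValueError("a Clone ID is expected to be longer than its library name")
    return clone_id[:4]


def lookup_deposit(clone_id, library_to_deposit):
    """Look up the deposit for a clone's source library, or None if not listed."""
    return library_to_deposit.get(library_name(clone_id))


# Hypothetical stand-in for Table 7; the value is a placeholder only.
table7 = {"HTWE": "DEPOSIT-PLACEHOLDER"}
deposit = lookup_deposit("HTWEP07", table7)
```

Here the Clone ID "HTWEP07" resolves to the library "HTWE", which is then used as the key into the library-to-deposit table.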

[23] In specific embodiments, the polynucleotides of the invention are at least 15, at 
least 30, at least 50, at least 100, at least 125, at least 500, or at least 1000 continuous 
nucleotides but are less than or equal to 300 kb, 200 kb, 100 kb, 50 kb, 15 kb, 10 kb, 7.5 kb, 5
kb, 2.5 kb, 2.0 kb, or 1 kb, in length. In a further embodiment, polynucleotides of the 
invention comprise a portion of the coding sequences, as disclosed herein, but do not 
comprise all or a portion of any intron. In another embodiment, the polynucleotides 
comprising coding sequences do not contain coding sequences of a genomic flanking gene 
(i.e., 5' or 3' to the gene of interest in the genome). In other embodiments, the 
polynucleotides of the invention do not contain the coding sequence of more than 1000, 500, 
250, 100, 50, 25, 20, 15, 10, 5, 4, 3, 2, or 1 genomic flanking gene(s). 
[24] A "polynucleotide" of the present invention also includes those polynucleotides 
capable of hybridizing, under stringent hybridization conditions, to sequences contained in 
SEQ ID NO:X, or the complement thereof (e.g., the complement of any one, two, three, four, 
or more of the polynucleotide fragments described herein), the polynucleotide sequence 
delineated in columns 8 and 9 of Table 2 or the complement thereof, and/or cDNA sequences 
contained in Clone ID NO:Z (e.g., the complement of any one, two, three, four, or more of 
the polynucleotide fragments, or the cDNA clone within the pool of cDNA clones deposited 
with the ATCC, described herein), and/or the polynucleotide sequence delineated in column 
6 of Table 1B or the complement thereof. "Stringent hybridization conditions" refers to an
overnight incubation at 42°C in a solution comprising 50% formamide, 5x SSC (750
mM NaCl, 75 mM trisodium citrate), 50 mM sodium phosphate (pH 7.6), 5x Denhardt's
solution, 10% dextran sulfate, and 20 µg/ml denatured, sheared salmon sperm DNA,
followed by washing the filters in 0.1x SSC at about 65°C.

[25] Also contemplated are nucleic acid molecules that hybridize to the polynucleotides 
of the present invention at lower stringency hybridization conditions. Changes in the 
stringency of hybridization and signal detection are primarily accomplished through the 
manipulation of formamide concentration (lower percentages of formamide result in lowered 
stringency); salt conditions, or temperature. For example, lower stringency conditions 
include an overnight incubation at 37°C in a solution comprising 6X SSPE (20X SSPE
= 3M NaCl; 0.2M NaH2PO4; 0.02M EDTA, pH 7.4), 0.5% SDS, 30% formamide, 100 µg/ml
salmon sperm blocking DNA; followed by washes at 50°C with 1X SSPE, 0.1% SDS.
In addition, to achieve even lower stringency, washes performed following stringent 
hybridization can be done at higher salt concentrations (e.g. 5X SSC). 
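
By way of non-limiting illustration, the salt concentrations implied by the buffer strengths recited above can be worked out by simple proportional scaling. The sketch below takes the stated 5x SSC composition (750 mM NaCl, 75 mM trisodium citrate) and the stated 20X SSPE composition (3 M NaCl, 0.2 M NaH2PO4, 0.02 M EDTA) as its only inputs; the function name and dictionary layout are assumptions of this example.

```python
from fractions import Fraction

# Compositions as stated in the text, expressed in mM.
SSC_5X_MM = {"NaCl": 750, "trisodium citrate": 75}
SSPE_20X_MM = {"NaCl": 3000, "NaH2PO4": 200, "EDTA": 20}


def dilute(stock_mm, stock_strength, target_strength):
    """Scale each component from stock_strength (e.g. 5 for 5x) to target_strength."""
    # Fractions built from decimal strings avoid binary rounding in values like 0.1.
    factor = Fraction(str(target_strength)) / Fraction(str(stock_strength))
    return {name: float(conc * factor) for name, conc in stock_mm.items()}


wash_ssc = dilute(SSC_5X_MM, 5, 0.1)    # 0.1x SSC stringent wash
hyb_sspe = dilute(SSPE_20X_MM, 20, 6)   # 6X SSPE lower stringency hybridization
```

On this arithmetic the 0.1x SSC wash contains 15 mM NaCl and 1.5 mM trisodium citrate, whereas 6X SSPE contains 900 mM NaCl, which illustrates why higher salt corresponds to lower stringency.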
[26] Note that variations in the above conditions may be accomplished through the 
inclusion and/or substitution of alternate blocking reagents used to suppress background in 
hybridization experiments. Typical blocking reagents include Denhardt's reagent, BLOTTO,
heparin, denatured salmon sperm DNA, and commercially available proprietary formulations. 
The inclusion of specific blocking reagents may require modification of the hybridization 
conditions described above, due to problems with compatibility. 

[27] Of course, a polynucleotide which hybridizes only to polyA+ sequences (such as 
any 3' terminal polyA+ tract of a cDNA shown in the sequence listing), or to a 
complementary stretch of T (or U) residues, would not be included in the definition of
"polynucleotide," since such a polynucleotide would hybridize to any nucleic acid molecule
containing a poly (A) stretch or the complement thereof (e.g., practically any double-stranded 
cDNA clone generated using oligo dT as a primer). 

[28] The polynucleotide of the present invention can be composed of any 
polyribonucleotide or polydeoxyribonucleotide, which may be unmodified RNA or DNA or
modified RNA or DNA. For example, polynucleotides can be composed of single- and
double-stranded DNA, DNA that is a mixture of single- and double-stranded regions, single-
and double-stranded RNA, and RNA that is a mixture of single- and double-stranded regions,
hybrid molecules comprising DNA and RNA that may be single-stranded or, more typically, 
double-stranded or a mixture of single- and double-stranded regions. In addition, the 
polynucleotide can be composed of triple-stranded regions comprising RNA or DNA or both 
RNA and DNA. A polynucleotide may also contain one or more modified bases or DNA or 
RNA backbones modified for stability or for other reasons. "Modified" bases include, for 
example, tritylated bases and unusual bases such as inosine. A variety of modifications can 
be made to DNA and RNA; thus, "polynucleotide" embraces chemically, enzymatically, or 
metabolically modified forms. 

[29] The polypeptide of the present invention can be composed of amino acids joined to 
each other by peptide bonds or modified peptide bonds, i.e., peptide isosteres, and may 
contain amino acids other than the 20 gene-encoded amino acids. The polypeptides may be 
modified by either natural processes, such as posttranslational processing, or by chemical
modification techniques which are well known in the art. Such modifications are well 
described in basic texts and in more detailed monographs, as well as in a voluminous 
research literature. Modifications can occur anywhere in a polypeptide, including the peptide 
backbone, the amino acid side-chains and the amino or carboxyl termini. It will be 
appreciated that the same type of modification may be present in the same or varying degrees 
at several sites in a given polypeptide. Also, a given polypeptide may contain many types of 
modifications. Polypeptides may be branched, for example, as a result of ubiquitination, and 
they may be cyclic, with or without branching. Cyclic, branched, and branched cyclic 
polypeptides may result from posttranslational natural processes or may be made by synthetic
methods. Modifications include acetylation, acylation, ADP-ribosylation, amidation,
covalent attachment of flavin, covalent attachment of a heme moiety, covalent attachment of
a nucleotide or nucleotide derivative, covalent attachment of a lipid or lipid derivative,
covalent attachment of phosphatidylinositol, cross-linking, cyclization, disulfide bond
formation, demethylation, formation of covalent cross-links, formation of cysteine, formation 
of pyroglutamate, formylation, gamma-carboxylation, glycosylation, GPI anchor formation, 
hydroxylation, iodination, methylation, myristoylation, oxidation, pegylation, proteolytic 
processing, phosphorylation, prenylation, racemization, selenoylation, sulfation, 
transfer-RNA mediated addition of amino acids to proteins such as arginylation, and ubiquitination. 
(See, for instance, PROTEINS - STRUCTURE AND MOLECULAR PROPERTIES, 2nd 
Ed., T. E. Creighton, W. H. Freeman and Company, New York (1993); 
POSTTRANSLATIONAL COVALENT MODIFICATION OF PROTEINS, B. C. Johnson, 
Ed., Academic Press, New York, pgs. 1-12 (1983); Seifter et al., Meth. Enzymol. 182:626-646 
(1990); Rattan et al., Ann. N.Y. Acad. Sci. 663:48-62 (1992)). 

[30] "SEQ ID NO:X" refers to a polynucleotide sequence described, for example, in 
Tables 1A or 2, while "SEQ ID NO:Y" refers to a polypeptide sequence described in column 
6 of Table 1A. SEQ ID NO:X is identified by an integer specified in column 4 of Table 1A. 
The polypeptide sequence SEQ ID NO:Y is a translated open reading frame (ORF) encoded 
by polynucleotide SEQ ID NO:X. "Clone ID NO:Z" refers to a cDNA clone described in 
column 2 of Table 1A. 

[31] "A polypeptide having functional activity" refers to a polypeptide capable of 
displaying one or more known functional activities associated with a full-length (complete) 
protein. Such functional activities include, but are not limited to, biological activity, 
antigenicity [ability to bind (or compete with a polypeptide for binding) to an 
anti-polypeptide antibody], immunogenicity (ability to generate antibody which binds to a 
specific polypeptide of the invention), ability to form multimers with polypeptides of the 
invention, and ability to bind to a receptor or ligand for a polypeptide. 
[32] The polypeptides of the invention can be assayed for functional activity (e.g. 
biological activity) using or routinely modifying assays known in the art, as well as assays 
described herein. Specifically, one of skill in the art may routinely assay human polypeptides 
(including fragments and variants) of the invention for activity using assays as described in 
the examples section below. 

[33] "A polypeptide having biological activity" refers to a polypeptide exhibiting 
activity similar to, but not necessarily identical to, an activity of a polypeptide of the present 
invention, including mature forms, as measured in a particular biological assay, with or 
without dose dependency. In the case where dose dependency does exist, it need not be 
identical to that of the polypeptide, but rather substantially similar to the dose-dependence in 
a given activity as compared to the polypeptide of the present invention (i.e., the candidate 
polypeptide will exhibit greater activity or not more than about 25-fold less and, preferably, 
not more than about tenfold less activity, and most preferably, not more than about three-fold 
less activity relative to the polypeptide of the present invention). 
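
The fold-difference thresholds recited above can be illustrated by the following minimal sketch. The function name, the tier labels, and the treatment of each "about" threshold as a hard cutoff are assumptions made for illustration only and do not limit the definition given above.

```python
def activity_tier(candidate_activity, reference_activity):
    """Classify a candidate by how many fold less active it is than the reference.

    Illustrative assumption: the "about 25-fold", "about tenfold" and
    "about three-fold" limits are treated here as exact cutoffs.
    """
    if candidate_activity <= 0 or reference_activity <= 0:
        raise ValueError("activities must be positive")
    # A value below 1 means the candidate is MORE active than the reference.
    fold_less = reference_activity / candidate_activity
    if fold_less <= 3:
        return "most preferred"
    if fold_less <= 10:
        return "preferred"
    if fold_less <= 25:
        return "within definition"
    return "outside definition"
```

For example, a candidate with one fifth of the reference activity falls in the "preferred" tier under these assumptions, while a candidate with greater activity than the reference falls in the "most preferred" tier.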

[34] Table 1A summarizes some of the polynucleotides encompassed by the invention 
(including contig sequences (SEQ ID NO:X) and clones (Clone ID NO:Z)) and further 
summarizes certain characteristics of these polynucleotides and the polypeptides encoded 
thereby. 

Val-51 to Arg-56, 
Ala-127 to Asp-133, 
Val-147 to Glu-153. 


Gly-8 to Cys-13, 
Gln-38 to Met-48, 
Arg-76 to Gln-82, 
Cys-87 to Asp-94. 


Cys-22 to Cys-31, 
Leu-35 to Pro-54, 
Gln-59 to Glu-73, 
Arg-131 to Met-140, 
Asn-149 to Arg-156, 


Pro-1 to Asp-10, 
Arg-24 to Leu-42, 
Val-65 to Ser-75, 
Arg-95 to Asp-104, 
Glu-111 to Ile-119, 
Asp-151 to Arg-157, 
His-212 to Leu-222, 
Ala-251 to Gly-256, 
His-309 to Ala-314, 
Gly-321 to Gln-331. 


Pro-1 to Asp-10, 
Arg-24 to Leu-42, 
Val-65 to Ser-75, 
Arg-95 to Asp-104, 
Glu-111 to Ile-119, 


Leu-33 to Gln-39. 


Asn-15 to Glu-24. 


Glu-51 to Phe-60, 
Gln-63 to Gly-73, 
Thr-85 to Lys-91. 


Asp-77 to Lys-82. 


Pro-1 to Lys-13, 
Pro-20 to Lys-39, 
Ala-46 to Thr-71, 
Pro-112 to Gln-122, 


Pro-8 to Gly-26, 
Cys-54 to Cys-66, 
Gly-73 to His-85. 


Ser-31 to Gly-43, 
Ser-45 to Gly-57. 


Ser-37 to Gly-49, 
Ser-51 to Gly-63, 
Val-93 to Cys-98. 


Thr-7 to Pro-18, 
Thr-235 to Gly-240. 


Gly-1 to Pro-11, 
Ser-39 to Thr-53. 


Met-77 to Asn-92. 


Gly-12 to Gly-20, 
Ser-86 to Glu-94, 
Pro-103 to Pro-110. 


Ser-123 to Gly-136. 




Phe-15 to Glu-24. 


Pro-26 to Ala-38, 
Lys-85 to Gly-97, 
Tyr-120 to Glu-131, 
Asp-158 to Leu-168, 
Asn-187 to Gly-197, 
Ser-204 to Asp-209. 


Asn-48 to Gly-55, 
Thr-81 to Asn-89. 


Gly-40 to Gly-46, 
Gln-60 to Arg-69, 
Lys-84 to Trp-91, 
Leu-112 to Arg-118. 


Ser-15 to Tyr-24, 
Met-47 to Tyr-56, 
Gly-127 to Ser-133. 

[35] The first column in Table 1A provides the gene number in the application 
corresponding to the clone identifier. The second column in Table 1A provides a unique 
"Clone ID NO:Z" for a cDNA clone related to each contig sequence disclosed in Table 1A. 
This clone ID references the cDNA clone which contains at least the 5'-most sequence of the 
assembled contig; at least a portion of SEQ ID NO:X was determined by directly 
sequencing the referenced clone. The reference clone may have more sequence than 
described in the sequence listing or the clone may have less. In the vast majority of cases, 
however, the clone is believed to encode a full-length polypeptide. In the case where a clone 
is not full-length, a full-length cDNA can be obtained by methods described elsewhere 
herein. 

[36] The third column in Table 1A provides a unique "Contig ID" identification for 
each contig sequence. The fourth column provides the "SEQ ID NO:" identifier for each of 
the contig polynucleotide sequences disclosed in Table 1A. The fifth column, "ORF (From-To)", 
provides the location (i.e., nucleotide position numbers) within the polynucleotide 
sequence "SEQ ID NO:X" that delineates the preferred open reading frame (ORF) shown in 
the sequence listing and referenced in Table 1A, column 6, as SEQ ID NO:Y. Where the 
nucleotide position number "To" is lower than the nucleotide position number "From", the 
preferred ORF is the reverse complement of the referenced polynucleotide sequence. 
[37] The sixth column in Table 1A provides the corresponding SEQ ID NO:Y for the 
polypeptide sequence encoded by the preferred ORF delineated in column 5. In one 
embodiment, the invention provides an amino acid sequence comprising, or alternatively 
consisting of, a polypeptide encoded by the portion of SEQ ID NO:X delineated by "ORF 
(From-To)". Also provided are polynucleotides encoding such amino acid sequences and the 
complementary strand thereto. 
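
By way of illustration only, the "ORF (From-To)" convention described above may be sketched as follows. The function name is hypothetical, and the sketch assumes that the position numbers are 1-based and inclusive, which is not expressly stated herein.

```python
# Translation table mapping each base to its complement (both cases).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def extract_orf(seq, orf_from, orf_to):
    """Return the nucleotides delineated by "ORF (From-To)".

    Assumption: positions are 1-based and inclusive. When To is lower
    than From, the ORF lies on the reverse complement of seq.
    """
    if orf_from <= orf_to:
        return seq[orf_from - 1:orf_to]
    # To < From: take the spanned region, then reverse-complement it.
    region = seq[orf_to - 1:orf_from]
    return region.translate(COMPLEMENT)[::-1]
```

For instance, for the hypothetical sequence "AAATGCCCTAAGG", positions 3 to 11 give the forward-strand region, while From 11 and To 3 give the reverse complement of the same region.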

[38] Column 7 in Table 1A lists residues comprising epitopes contained in the 
polypeptides encoded by the preferred ORF (SEQ ID NO:Y), as predicted using the 
algorithm of Jameson and Wolf (1988) Comp. Appl. Biosci. 4:181-186. The Jameson-Wolf 
antigenic analysis was performed using the computer program PROTEAN (Version 3.11 for 
the Power Macintosh, DNASTAR, Inc., 1228 South Park Street, Madison, WI). In specific 
embodiments, polypeptides of the invention comprise, or alternatively consist of, at least one, 
two, three, four, five or more of the predicted epitopes as described in Table 1A. It will be 
appreciated that depending on the analytical criteria used to predict antigenic determinants, 
the exact address of the determinant may vary slightly. 
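
The residue ranges listed in column 7 (e.g., "Val-51 to Arg-56") can be read programmatically. The following is a minimal sketch of one way to do so; the function name and the regular expression are assumptions for illustration, and the sketch does not implement the Jameson-Wolf prediction itself.

```python
import re

# Three-letter residue code, hyphen, position, "to", then the same again.
# Whitespace around "to" is optional.
EPITOPE_RE = re.compile(r"([A-Za-z]{3})-(\d+)\s*to\s*([A-Za-z]{3})-(\d+)")


def parse_epitopes(text):
    """Return (start_residue, start_pos, end_residue, end_pos) tuples."""
    return [(a, int(i), b, int(j)) for a, i, b, j in EPITOPE_RE.findall(text)]


def epitope_length(epitope):
    """Number of residues spanned by one parsed epitope, inclusive."""
    _, start, _, end = epitope
    return end - start + 1
```

Applied to "Val-51 to Arg-56, Ala-127 to Asp-133.", the sketch yields two ranges spanning six and seven residues, respectively.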
[39] Column 8 in Table 1A provides an expression profile and library code: count for 
each of the contig sequences (SEQ ID NO:X) disclosed in Table 1A, which can routinely be 
combined with the information provided in Table 4 and used to determine the tissues, cells, 
and/or cell line libraries which predominantly express the polynucleotides of the invention. 
The first number in column 8 (preceding the colon) represents the tissue/cell source 
identifier code corresponding to the code and description provided in Table 4. For those 
identifier codes in which the first two letters are not "AR", the second number in column 8 
(following the colon) represents the number of times a sequence corresponding to the 
reference polynucleotide sequence was identified in the tissue/cell source. Those tissue/cell 
source identifier codes in which the first two letters are "AR" designate information 
generated using DNA array technology. Utilizing this technology, cDNAs were amplified by 
PCR and then transferred, in duplicate, onto the array. Gene expression was assayed through 
hybridization of first strand cDNA probes to the DNA array. cDNA probes were generated 
from total RNA extracted from a variety of different tissues and cell lines. Probe synthesis 
was performed in the presence of 33P-dCTP, using oligo(dT) to prime reverse transcription. 
After hybridization, high stringency washing conditions were employed to remove 
non-specific hybrids from the array. The remaining signal, emanating from each gene target, was 
measured using a Phosphorimager. Gene expression was reported as Phosphor Stimulating 
Luminescence (PSL) which reflects the level of phosphor signal generated from the probe 
hybridized to each of the gene targets represented on the array. A local background signal 
subtraction was performed before the total signal generated from each array was used to 
normalize gene expression between the different hybridizations. The value presented after 
"[array code]:" represents the mean of the duplicate values, following background subtraction 
and probe normalization. One of skill in the art could routinely use this information to 
identify normal and/or diseased tissue(s) which show a predominant expression pattern of the 
corresponding polynucleotide of the invention or to identify polynucleotides which show 
predominant and/or specific tissue and/or cell expression. 
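
As a non-limiting illustration of how the column 8 entries described above might be separated, the following sketch splits "code: value" entries into array-derived values (codes beginning with "AR") and library counts (all other codes). The comma-separated layout, the function name, and the example codes are assumptions and are not taken from Table 1A or Table 4.

```python
def parse_expression_profile(cell):
    """Split a column 8 cell into (array_signal, library_counts) dicts.

    Assumption: entries are comma-separated "code: value" pairs.
    Codes starting with "AR" carry a normalized array signal (float);
    all other codes carry the number of times the sequence was seen (int).
    """
    array_signal, library_counts = {}, {}
    for entry in cell.split(","):
        entry = entry.strip()
        if not entry:
            continue
        code, _, value = entry.partition(":")
        code, value = code.strip(), value.strip()
        if code.startswith("AR"):
            array_signal[code] = float(value)
        else:
            library_counts[code] = int(value)
    return array_signal, library_counts
```

The codes returned by such a routine would then be looked up in Table 4 to obtain the corresponding tissue/cell source description.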

[40] Column 9 in Table 1A provides a chromosomal map location for certain 
polynucleotides of the invention. Chromosomal location was determined by finding exact 
matches to EST and cDNA sequences contained in the NCBI (National Center for 
Biotechnology Information) UniGene database. Each sequence in the UniGene database is 
assigned to a "cluster"; all of the ESTs, cDNAs, and STSs in a cluster are believed to be 
derived from a single gene. Chromosomal mapping data is often available for one or more 
sequence(s) in a UniGene cluster; this data (if consistent) is then applied to the cluster as a 
whole. Thus, it is possible to infer the chromosomal location of a new polynucleotide 
sequence by determining its identity with a mapped UniGene cluster. 
[41] A modified version of the computer program BLASTN (Altschul et al., J. Mol. 
Biol. 215:403-410 (1990); and Gish and States, Nat. Genet. 3:266-272 (1993)) was used to 
search the UniGene database for EST or cDNA sequences that contain exact or near-exact 
matches to a polynucleotide sequence of the invention (the 'Query'). A sequence from the 
UniGene database (the 'Subject') was said to be an exact match if it contained a segment of 
50 nucleotides in length such that 48 of those nucleotides were in the same order as found in 
the Query sequence. If all of the matches that met these criteria were in the same UniGene 
cluster, and mapping data was available for this cluster, it is indicated in Table 1A under the 
heading "Cytologic Band". Where a cluster had been further localized to a distinct cytologic 
band, that band is disclosed; where no banding information was available, but the gene had 
been localized to a single chromosome, the chromosome is disclosed. 
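
The matching criterion and the cluster-consistency rule described above may be sketched as follows. This is a simplified illustration and not the modified BLASTN program referred to above: it assumes an ungapped comparison in which at least 48 of 50 aligned positions are identical, and the function names, cluster identifiers, and band labels are hypothetical.

```python
def is_exact_match(query, subject, window=50, min_identical=48):
    """True if any ungapped 50-nt window of subject matches a 50-nt
    window of query at 48 or more positions (brute-force scan)."""
    for q in range(len(query) - window + 1):
        q_seg = query[q:q + window]
        for s in range(len(subject) - window + 1):
            s_seg = subject[s:s + window]
            identical = sum(a == b for a, b in zip(q_seg, s_seg))
            if identical >= min_identical:
                return True
    return False


def infer_cytologic_band(matching_clusters, cluster_to_band):
    """Return a map location only when every match falls in the same
    cluster and mapping data exists for that cluster; otherwise None."""
    clusters = set(matching_clusters)
    if len(clusters) != 1:
        return None
    return cluster_to_band.get(clusters.pop())
```

The brute-force scan is quadratic in sequence length and is intended only to make the criterion concrete; a database-scale search would rely on an indexed search program.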
[42] Once a presumptive chromosomal location was determined for a polynucleotide of 
the invention, an associated disease locus was identified by comparison with a database of 
diseases which have been experimentally associated with genetic loci. The database used was 
the Morbid Map, derived from OMIM™ (supra). If the putative chromosomal location of a 
polynucleotide of the invention (Query sequence) was associated with a disease in the 
Morbid Map database, an OMIM reference identification number was noted in column 10, 
Table 1A, labelled "OMIM Disease Reference(s)". Table 5 is a key to the OMIM reference 
identification numbers (column 1), and provides a description of the associated disease in 
Column 2. 
[43] Table 1B summarizes additional polynucleotides encompassed by the invention 
(including cDNA clones related to the sequences (Clone ID NO:Z), contig sequences (contig 
identifier (Contig ID:), contig nucleotide sequence identifiers (SEQ ID NO:X)), and genomic 
sequences (SEQ ID NO:B)). The first column provides a unique clone identifier, "Clone ID 
NO:Z", for a cDNA clone related to each contig sequence. The second column provides the 
sequence identifier, "SEQ ID NO:X", for each contig sequence. The third column provides a 
unique contig identifier, "Contig ID:", for each contig sequence. The fourth column provides 
a BAC identifier, "BAC ID NO:A", for the BAC clone referenced in the corresponding row of 
the table. The fifth column provides the nucleotide sequence identifier, "SEQ ID NO:B" for 
a fragment of the BAC clone identified in column four of the corresponding row of the table. 
The sixth column, "Exon From-To", provides the location (i.e., nucleotide position numbers) 
within the polynucleotide sequence of SEQ ID NO:B which delineate certain polynucleotides 
of the invention that are also exemplary members of polynucleotide sequences that encode 
polypeptides of the invention (e.g., polypeptides containing amino acid sequences encoded 
by the polynucleotide sequences delineated in column six, and fragments and variants 
thereof). 